An Automatic Language Identification Using Audio Features
Abstract
Automatic language identification (LID) is the task of automatically recognizing the language of a given spoken utterance. Language identification is used to identify the language of a particular audio recording and to reduce the complexity of the audio sample. LID systems that rely on multiple-language phone recognition followed by language modeling (PRLM) and n-gram language modeling produce the best performance in formal LID evaluations. By contrast, Gaussian mixture model (GMM) systems, which measure acoustic characteristics, are far more computationally efficient but tend to provide inferior levels of performance. We describe here the efficiency of an LID system for two languages, English and Hindi. The languages are evaluated on standard recorded databases, from which features are extracted as Mel-frequency cepstral coefficients (MFCC). Language modeling is done using PRLM, and classification is done using a Gaussian mixture model (GMM). The results obtained indicate that the LID system is accurate for the chosen languages, and system performance is evaluated with both PRLM and GMM.
Keywords: Language Identification, PRLM, GMM, MFCC, accuracy.
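The GMM classification step described in the abstract can be sketched roughly as follows: one diagonal-covariance GMM is fitted per language, and an utterance is assigned to the language whose model gives the highest average per-frame log-likelihood. This is only an illustration under stated assumptions, not the authors' implementation. The function names (`fit_diag_gmm`, `avg_log_likelihood`, `identify`), the number of mixture components, and the 13-dimensional synthetic vectors standing in for real MFCC frames are all hypothetical; the PRLM stage is not shown.

```python
import numpy as np


def _component_log_probs(X, weights, means, variances):
    """Log of weight * Gaussian density for every frame and component: shape (n, k)."""
    diff = X[:, None, :] - means[None, :, :]                      # (n, k, d)
    mahal = np.sum(diff ** 2 / variances, axis=2)                 # (n, k)
    log_det = np.sum(np.log(2.0 * np.pi * variances), axis=1)     # (k,)
    return np.log(weights) - 0.5 * (mahal + log_det)


def _logsumexp(a):
    """Numerically stable log-sum-exp over axis 1."""
    m = a.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True)))[:, 0]


def fit_diag_gmm(X, k=2, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM to frames X (n_frames, n_dims) with plain EM."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    means = X[rng.choice(n, size=k, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each frame.
        log_p = _component_log_probs(X, weights, means, variances)
        resp = np.exp(log_p - _logsumexp(log_p)[:, None])
        # M-step: re-estimate weights, means and (floored) variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ (X ** 2)) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, 1e-6)
    return weights, means, variances


def avg_log_likelihood(X, model):
    """Average per-frame log-likelihood of frames X under a fitted GMM."""
    weights, means, variances = model
    return float(np.mean(_logsumexp(_component_log_probs(X, weights, means, variances))))


def identify(X, models):
    """Return the language whose GMM scores the utterance frames highest."""
    return max(models, key=lambda lang: avg_log_likelihood(X, models[lang]))


# Synthetic stand-ins for MFCC frames (assumption: 13 coefficients per frame).
rng = np.random.default_rng(42)
train_en = rng.normal(loc=0.0, scale=1.0, size=(500, 13))
train_hi = rng.normal(loc=1.5, scale=1.0, size=(500, 13))
models = {
    "English": fit_diag_gmm(train_en, k=2, seed=1),
    "Hindi": fit_diag_gmm(train_hi, k=2, seed=2),
}

# Unseen "utterances" drawn from the same toy distributions.
test_en = rng.normal(loc=0.0, scale=1.0, size=(200, 13))
test_hi = rng.normal(loc=1.5, scale=1.0, size=(200, 13))
guess_en = identify(test_en, models)
guess_hi = identify(test_hi, models)
print(guess_en, guess_hi)
```

In a real system the synthetic arrays would be replaced by MFCC frames extracted from the English and Hindi recordings, and the number of mixture components would be tuned on held-out data.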
Similar resources
Speaker Identification for Swiss German with Spectral and Rhythm Features
We present results of speech rhythm analysis for automatic speaker identification. We expand previous experiments using similar methods for language identification. Features describing the rhythmic properties of salient changes in signal components are extracted and used in a speaker identification task to determine to what extent they are descriptive of speaker variability. We also test the ...
Multi-Language Identification Using Convolutional Recurrent Neural Network
Language identification, an important aspect of automatic speaker recognition, has seen many changes and new approaches to improve performance over the last decade. We compare the performance of using the audio spectrum in the log scale and using polyphonic sound sequences from raw audio samples to train the neural network and to classify speech as either English or Spanish. To achieve this,...
Offline Language-free Writer Identification based on Speeded-up Robust Features
This article proposes offline language-free writer identification based on speeded-up robust features (SURF), which goes through training, enrollment, and identification stages. In all stages, an isotropic Box filter is first used to segment the handwritten text image into word regions (WRs). Then, the SURF descriptors (SUDs) of word region and the corresponding scales and orientations (SOs) are extr...
Combining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)
Commercial quadcopters, which have many private, commercial, and public sector applications, are a rapidly advancing technology. Currently, there is no guarantee of the safe operation of these devices in the community. Three different methods for automatic identification of commercial quadcopters are presented in this paper. Among these three techniques, two are based on deep neural networks in whi...
Deep Neural Network Bottleneck Features for Acoustic Event Recognition
Bottleneck features have been shown to be effective in improving the accuracy of speaker recognition, language identification and automatic speech recognition. However, few works have focused on bottleneck features for acoustic event recognition. This paper proposes a novel acoustic event recognition framework using bottleneck features derived from a Deep Neural Network (DNN). In addition to co...
Publication date: 2013